PREFACE

From certain viewpoints intercorrelation (that is, correlation between
independent variables) is not a major problem in statistical analysis. Routine
instructions for solving multiple-regression problems include formulas for net
regression coefficients and standard errors which automatically take account
of the effects of intercorrelation. Nevertheless, research workers are
frequently surprised when two analyses showing nearly the same direct
correlations between the dependent and each independent variable yield widely
differing net regression and multiple correlation coefficients.

Many of the three-variable calculations discussed in this paper were
developed by the senior author in May 1947 to explain such happenings in a
precise way. Late in 1952 the junior author rechecked and extended the
three-variable calculations. He also developed representative calculations for
the four-variable case, setting up the intercorrelation formulas in a matrix
notation which permits generalization to any number of variables.

The authors are indebted to Frederick V. Waugh for helpful suggestions
on the four-variable and general cases. The four-variable computations were
carried out by Jacqueline Spiro.

EFFECTS OF INTERCORRELATION UPON MULTIPLE
CORRELATION AND REGRESSION MEASURES

by

Karl  A.  Fox,  Chief,  Statistical  and  Historical  Research  Branch, 
and  James  F.  Cooney,  Jr.,  Mathematical  Statistician 
Agricultural  Marketing  Service 

Trained  statisticians  have  known  the  effects  of  intercorrelation  in  a 
general  way  for  some  60  years.  They  have  given  particular  attention  to  the 
extreme  case  ("multicollinearity")  in  which  two  or  more  independent  variables 
are  so  highly  correlated  that  their  separate  effects  cannot  be  distinguished. 
At  the  other  extreme,  where  there  is  no  intercorrelation,  the  effects  of  the 
different  independent  variables  are  strictly  additive. 

Many users of the graphic method of regression analysis know that
intercorrelation between independent variables tends to delay the convergence
of successive graphic approximations toward the mathematical solution.

Trained statisticians are also aware that increasing levels of
intercorrelation are reflected in increasing standard errors of net regression
coefficients; that is, high intercorrelation tends to mean lowered reliability
for the individual regression constants. But apparently it is safe to say that
most students of elementary statistics and most persons who make regression
analyses as an adjunct to their applied work have only a vague idea of the
effects of intercorrelation and frequently get results from
multiple-correlation analyses that they are unable to explain. More concrete
information on the effects of intercorrelation through its whole range of
variation, and not merely at the points 0 and 1, is therefore expected to be
useful. This information for the three-variable case is shown by means of
charts and tables for a number of pairs of values of the simple (or gross)
correlation coefficients between the dependent variable and each of the two
independent variables. The four-variable case is treated for a more limited
range of numerical values but in a notation which permits generalization of
the results to any number of variables.

SUMMARY 

With given values for r12 and r13 and a specified size of sample, all of
the correlation measures in a three-variable problem can be expressed simply
as functions of r23. In certain cases as r23 increases the coefficient of
multiple determination declines continuously. In other cases it trends down
to a minimum value and then increases. For given values of r12 and r13, r23
can take only a limited range of values. In some cases, as the degree of
intercorrelation increases beyond a certain level, the "weaker" of the two
partial regression coefficients changes sign. Within a considerable range of
values of r23 in the region of this sign change, the value of the
corresponding partial regression coefficient, for samples of about 20
observations, does not differ significantly from zero.

The four-variable case is more complicated because six simple correlation
coefficients are involved. The same approach can be used, however, by
specifying five of these and allowing the sixth one (r34) to vary over its
entire range of possible values. When each of the three simple intercorrelation
coefficients is very high the values of the partial regression coefficients
are very unstable and, in samples of about 20 observations, are smaller than
their standard error in all but a small portion of the range of permissible
values of r34. As r23 and r24 take on smaller values, the partial regression
coefficients acquire greater stability and exceed their standard error over a
large part of the permissible range of r34.

EFFECTS  OF  INTERCORRELATION  IN  THE  THREE -VARIABLE  CASE 

The  Approach 

The general problem of three-variable regression analysis is to estimate
the values of a dependent variable, X1, based on given values of two
independent variables, X2 and X3. Assume that we have already calculated the
direct correlation coefficients (r12 and r13) between X1 and X2 on the one hand
and X1 and X3 on the other for a number of different problems. Suppose that
several of the analyses, based on entirely different sets of data, have yielded
the same values of r12 and r13. Nevertheless, in each case, we may obtain
different values for the multiple and partial correlation coefficients and for
the net regression, or beta, coefficients. The coefficient of multiple
determination, R²1.23, may vary from almost 1.0 down to 0.5, or lower.

We then ask, Why do these differences occur? In the three-variable case,
these variations can be wholly explained by variations in the value of the
intercorrelation coefficient, r23, between the independent variables X2 and X3.

Basic  Formulas 

Several standard methods of calculating multiple correlation and regression
coefficients start out from the determinant of simple correlation coefficients,
which for the three-variable case is as follows:

\[
\Delta =
\begin{vmatrix}
1 & r_{12} & r_{13} \\
r_{12} & 1 & r_{23} \\
r_{13} & r_{23} & 1
\end{vmatrix}
= 1 + 2\, r_{12}\, r_{13}\, r_{23} - r_{12}^2 - r_{13}^2 - r_{23}^2 .
\tag{1}
\]


All the basic correlation and regression measures in the three-variable case
can be derived from the values of the three simple correlation coefficients,
with the following exceptions which are trivial in the present context: First,
if we wish to talk about net regression coefficients in terms of original units
(pounds, dollars, and so on) rather than normalized or standard deviation units
we must multiply the beta coefficients by the ratio of the standard deviation
of the dependent variable to that of the particular independent variable
concerned. Second, the standard error of the beta coefficient, or of the
corresponding net regression coefficient, is affected by the number of
observations in the sample.
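
Equations (2) through (7) define the various correlation and regression
measures in terms of the three simple correlation coefficients:

\[
R^2_{1.23} = \frac{r_{12}^2 + r_{13}^2 - 2\, r_{12}\, r_{13}\, r_{23}}{1 - r_{23}^2}
\tag{2}
\]

\[
\beta_{12.3} = \frac{r_{12} - r_{13}\, r_{23}}{1 - r_{23}^2}
\tag{3}
\]

\[
b_{12.3} = \beta_{12.3} \cdot \frac{S_1}{S_2} ,
\tag{3.1}
\]

where S1 and S2 are the standard deviations of X1 and X2 respectively.

\[
\beta_{13.2} = \frac{r_{13} - r_{12}\, r_{23}}{1 - r_{23}^2}
\tag{4}
\]

\[
b_{13.2} = \beta_{13.2} \cdot \frac{S_1}{S_3} ,
\tag{4.1}
\]

where S3 is the standard deviation of X3.

\[
r_{12.3} = \frac{r_{12} - r_{13}\, r_{23}}{\sqrt{(1 - r_{23}^2)(1 - r_{13}^2)}}
\tag{5}
\]

\[
r_{13.2} = \frac{r_{13} - r_{12}\, r_{23}}{\sqrt{(1 - r_{23}^2)(1 - r_{12}^2)}}
\tag{6}
\]

\[
s_{\beta_{12.3}} = s_{\beta_{13.2}}
= \frac{\sqrt{1 - R^2_{1.23}}}{\sqrt{1 - r_{23}^2}\, \sqrt{N - 3}}
= \frac{\sqrt{1 + 2\, r_{12}\, r_{13}\, r_{23} - r_{12}^2 - r_{13}^2 - r_{23}^2}}{(1 - r_{23}^2)\, \sqrt{N - 3}}
\tag{7}
\]

\[
s_{b_{12.3}} = s_{\beta_{12.3}} \cdot \frac{S_1}{S_2} , \quad \text{and}
\tag{7.1}
\]

\[
s_{b_{13.2}} = s_{\beta_{13.2}} \cdot \frac{S_1}{S_3} .
\tag{7.2}
\]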
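
As an illustration only, the three-variable formulas can be evaluated with a short Python sketch. The function name and the dictionary keys are hypothetical and are not part of the original paper; the arithmetic follows the standard expressions for the multiple determination coefficient, the beta coefficients, the partial correlations, and the common standard error of the betas with N - 3 degrees of freedom. The sketch assumes that r23 lies strictly inside its permissible range, so that the determinant is positive.

```python
import math


def three_variable_measures(r12, r13, r23, n):
    """Evaluate the three-variable measures for one admissible set of correlations."""
    one_minus = 1.0 - r23 ** 2
    # Determinant of the 3 x 3 matrix of simple correlation coefficients.
    delta = 1.0 + 2.0 * r12 * r13 * r23 - r12 ** 2 - r13 ** 2 - r23 ** 2
    beta12 = (r12 - r13 * r23) / one_minus  # beta coefficient of X2
    beta13 = (r13 - r12 * r23) / one_minus  # beta coefficient of X3
    # The same standard error applies to both beta coefficients.
    s_beta = math.sqrt(delta) / (one_minus * math.sqrt(n - 3))
    return {
        "R2": (r12 ** 2 + r13 ** 2 - 2.0 * r12 * r13 * r23) / one_minus,
        "beta12.3": beta12,
        "beta13.2": beta13,
        "r12.3": (r12 - r13 * r23) / math.sqrt(one_minus * (1.0 - r13 ** 2)),
        "r13.2": (r13 - r12 * r23) / math.sqrt(one_minus * (1.0 - r12 ** 2)),
        "s_beta": s_beta,
        "t12.3": beta12 / s_beta,
        "t13.2": beta13 / s_beta,
    }


# Example: r12 = r13 = 0.7, r23 = 0.4, N = 20 (one row of table 1).
row = three_variable_measures(0.7, 0.7, 0.4, 20)
print({name: round(value, 4) for name, value in row.items()})
```

Small differences in the last decimal place relative to the printed tables are to be expected, since the tabulated t-ratios appear to have been formed from rounded standard errors.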

It may be seen from these formulas that once we have fixed the values of r12
and r13, the various measures can be expressed simply as functions of r23. In
the charts and tables that follow, the value of each correlation measure was
calculated for series of values of the intercorrelation coefficient r23
covering all or nearly all of the range of possible values of that coefficient,
given the stated values of r12 and r13. 1/

Discussion  of  Charts  and  Tables 

In figure 1 the values of r12 and r13 were chosen in such a way that the
coefficient of multiple determination is 0.98 when the intercorrelation
coefficient is zero. As r23 increases, the value of R²1.23 declines
continuously, approaching a lower limit of 0.49 as r23 approaches 1. At this
point the variable X3 adds nothing to the explanation of X1 that is not already
given by the single independent variable X2. The partial correlation
coefficient r12.3 decreases continuously as r23 increases, approaching zero as
r23 approaches 1. The beta coefficient also decreases through this entire range
and its standard error increases. By the time r23 exceeds 0.7, the beta
coefficient (based on an assumed 20 observations) is no longer significantly
different from zero at the commonly used 5-percent probability level.
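
The loss of significance described for figure 1 can be located numerically. The sketch below is an illustration, not the authors' procedure: it assumes a two-sided 5-percent critical value of about 2.110 for Student's t with N - 3 = 17 degrees of freedom (taken from standard tables), and the function names are hypothetical.

```python
import math

# Assumed two-sided 5-percent point of Student's t with 17 degrees of freedom.
T_CRIT_5PCT_17DF = 2.110


def beta12_t_ratio(r12, r13, r23, n):
    """Ratio of the beta coefficient of X2 to its standard error."""
    one_minus = 1.0 - r23 ** 2
    delta = 1.0 + 2.0 * r12 * r13 * r23 - r12 ** 2 - r13 ** 2 - r23 ** 2
    beta = (r12 - r13 * r23) / one_minus
    s_beta = math.sqrt(delta) / (one_minus * math.sqrt(n - 3))
    return beta / s_beta


def first_nonsignificant_r23(r12, r13, n, t_crit, step=0.001):
    """Smallest grid value of r23 in [0, 1) at which the t-ratio drops below t_crit."""
    for k in range(int(round(1.0 / step))):
        r23 = k * step
        if beta12_t_ratio(r12, r13, r23, n) < t_crit:
            return r23
    return None


threshold = first_nonsignificant_r23(0.7, 0.7, 20, T_CRIT_5PCT_17DF)
print(threshold)
```

Under these assumptions the crossing falls between r23 = 0.6 and r23 = 0.7, which agrees with the t-ratios of 2.3185 and 1.8642 tabulated for those two values in table 1.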

Figure 2 illustrates the fact that the values of r12 and r13 set certain
limits upon the range of values which r23 may take. Obviously, if X2 and X3
are both closely correlated with X1 they have some degree of correlation with
each other. The exact nature of the limits set upon r23 by the values of r12
and r13 is shown in appendix note 1. In this particular case, r23 cannot be
lower than 0.62.
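
Appendix note 1 is not reproduced in this section, so the following sketch rests on an assumption: that the limits on r23 are the two values at which the determinant of the simple correlation coefficients equals zero, namely r12 r13 plus or minus the square root of (1 - r12²)(1 - r13²). This is the standard condition for three correlations to be mutually consistent, and it reproduces the lowest and highest values of r23 listed in the tables. The function name is hypothetical.

```python
import math


def r23_limits(r12, r13):
    """Lowest and highest r23 consistent with given r12 and r13.

    Assumes the limits are the roots of the correlation determinant in r23.
    """
    centre = r12 * r13
    half_width = math.sqrt((1.0 - r12 ** 2) * (1.0 - r13 ** 2))
    return centre - half_width, centre + half_width


for pair in [(0.7, 0.7), (0.9, 0.9), (0.9, 0.7), (0.9, 0.1)]:
    low, high = r23_limits(*pair)
    print(pair, round(low, 4), round(high, 4))
```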

Figure 3 shows a result that may be surprising to many applied workers.
As intercorrelation increases beyond a certain level, the "weaker" of the two
partial regression coefficients changes sign from positive to negative. This
change in sign occurs at a value of r23 somewhat above the value of the lower
of the two simple correlation coefficients, r12 and r13. Within a considerable
range of values of r23 in the region of this sign change, the value of
the corresponding beta coefficient would not differ significantly from zero.
Other features illustrated in figure 3 are (1) that the "stronger" of the two
regression coefficients increases for a time as the intercorrelation increases,
and (2) that the coefficient of multiple determination trends down to a
minimum at some value of r23 greater than the lower of the two direct
coefficients and then increases again.

1/ In the charts and tables that follow, r12 and r13 are always taken as
positive, and the corresponding values of r23 and other measures are
predominantly positive. If the same absolute values of r12 and r13 are taken
with negative signs, the corresponding values of R²1.23, r23, sβ12.3 and sβ13.2
are the same as before; absolute values of β12.3, β13.2, r12.3 and r13.2 are
the same as before but with the opposite sign. If we take r12 positive and r13
negative, r23 will be predominantly negative. Values of R²1.23, sβ12.3, sβ13.2,
β12.3 and r12.3 will be the same as in the first case (r12 and r13 both
positive); and the absolute values of β13.2 and r13.2 will be the same as in
the first case but with opposite sign. Finally, if we take r12 negative and r13
positive, the values of r23 will be predominantly negative; those of R²1.23,
sβ12.3, sβ13.2, β13.2 and r13.2 will be the same as in the first case; and
the absolute values of β12.3 and r12.3 will be the same as in the first case
but with opposite sign. As the absolute values of all the measures are
unchanged by these interchanges of signs, the figures tabulated below each
chart can be used for all four cases with appropriate changes in signs.
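
The sign change and the minimum described for figure 3 can be traced with a simple grid scan. The sketch below is illustrative, with hypothetical function names. It uses the observation that the numerator of the weaker beta coefficient, r13 - r12 r23, vanishes at r23 = r13/r12; the scan also suggests that the minimum of the coefficient of multiple determination falls at that same value of r23.

```python
def beta13_2(r12, r13, r23):
    """Beta coefficient of the weaker variable, X3."""
    return (r13 - r12 * r23) / (1.0 - r23 ** 2)


def r_squared(r12, r13, r23):
    """Coefficient of multiple determination for the three-variable case."""
    return (r12 ** 2 + r13 ** 2 - 2.0 * r12 * r13 * r23) / (1.0 - r23 ** 2)


def scan_r23(r12, r13, low, high, steps=10000):
    """Return (first r23 with a negative weaker beta, r23 minimizing R squared)."""
    grid = [low + (high - low) * k / steps for k in range(1, steps)]
    sign_change = next(x for x in grid if beta13_2(r12, r13, x) < 0.0)
    r2_minimum_at = min(grid, key=lambda x: r_squared(r12, r13, x))
    return sign_change, r2_minimum_at


# Case of figure 3 and table 3: r12 = 0.9, r13 = 0.7; r23 ranges over (0.3187, 0.9413).
sign_change, r2_minimum_at = scan_r23(0.9, 0.7, 0.3187, 0.9413)
print(round(sign_change, 4), round(r2_minimum_at, 4),
      round(r_squared(0.9, 0.7, r2_minimum_at), 4))
```

Both grid values lie close to 0.7/0.9, or about 0.778, which is somewhat above the lower simple coefficient of 0.7, in line with the behavior reported for table 3.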

The characteristics of figure 3 are repeated in the data shown in tables
4 and 5 and in figure 4. Each shows minimum values 2/ for the coefficient of
multiple determination, R²1.23, the stronger partial correlation coefficient,
r12.3, and the stronger regression coefficient, β12.3, and each shows a sign
change for the weaker coefficients, β13.2 and r13.2. As the spread between
r12 and r13 increases, so also does the range of permissible values of r23.
When r13 falls to 0.3, r23 can take on small negative values as well as
positive values. If r13 equals 0.1, then r23 can take values slightly lower
than -0.3.

The summary tables contain values of the "t-ratios", that is, ratios of
the respective net regression coefficients to their standard errors. As in
each of the last four cases β12.3 has a minimum and sβ has a maximum, the
corresponding t-ratio has a minimum beyond which it rises again with further
increases in r23.

2/ That is, minima in the mathematical sense of points at which the slope
with respect to r23 is zero and becomes positive as r23 increases and negative
as r23 decreases. Further information on these minimum values is given in
appendix note 2.

Figure 1.- Effects of intercorrelation, 3-variable case: r12 = .7; r13 = .7;
N = 20. (Chart of the measures plotted against r23; data in table 1.)


Table 1.- Data for case in which r12 = 0.7, r13 = 0.7, and N = 20

                                                                  t-ratio for -
    r23       R²1.23    β12.3     β13.2     r12.3     r13.2    sβ 1/     β12.3     β13.2

 -0.020 2/    1.0000    0.7143    0.7143    1.0000    1.0000   0          ...       ...
  0            .9800     .7000     .7000     .9803     .9803   0.0346   20.2312   20.2312
   .100        .8909     .6364     .6364     .8866     .8866    .0806    7.8958    7.8958
   .200        .8167     .5833     .5833     .8003     .8003    .1058    5.5132    5.5132
   .300        .7538     .5385     .5385     .7193     .7193    .1261    4.2704    4.2704
   .400        .7000     .5000     .5000     .6417     .6417    .1449    3.4507    3.4507
   .500        .6533     .4667     .4667     .5659     .5659    .1649    2.8302    2.8302
   .600        .6125     .4375     .4375     .4901     .4901    .1887    2.3185    2.3185
   .700        .5765     .4118     .4118     .4118     .4118    .2209    1.8642    1.8642
   .800        .5444     .3889     .3889     .3267     .3267    .2728    1.4256    1.4256
   .900        .5158     .3684     .3684     .2249     .2249    .3872     .9514     .9514
   .950        .5026     .3590     .3590     .1570     .1570    .5478     .6553     .6553
   .980        .4949     .3535     .3535     .0985     .0985    .8655     .4084     .4084
   .990        .4925     .3518     .3518     .0695     .0695   1.2278     .2865     .2865
   .999        .4902     .3502     .3502     .0219     .0219   3.1623     .1107     .1107
  1.000 3/      ...       ...       ...       ...       ...     ...      0         0

1/ Identical for each β.

2/ Lowest possible value of r23.

3/ Highest possible value of r23.

Figure 2.- Effects of intercorrelation, 3-variable case: r12 = .9; r13 = .9;
N = 20. (Chart of the measures plotted against r23; data in table 2.)


Table 2.- Data for case in which r12 = 0.9, r13 = 0.9, and N = 20

                                                                  t-ratio for -
    r23       R²1.23    β12.3     β13.2     r12.3     r13.2    sβ 1/     β12.3     β13.2

  0.620 2/    1.0000    0.5556    0.5556    1.0000    1.0000   0          ...       ...
   .700        .9529     .5294     .5294     .8710     .8710   0.0735    7.2073    7.2073
   .800        .9000     .5000     .5000     .6883     .6883    .1277    3.9154    3.9154
   .900        .8526     .4737     .4737     .4737     .4737    .2135    2.2183    2.2183
   .950        .8308     .4615     .4615     .3306     .3306    .3195    1.4444    1.4444
   .980        .8182     .4545     .4545     .2076     .2076    .5193     .8752     .8752
   .990        .8141     .4523     .4523     .1463     .1463    .7431     .6087     .6087
   .999        .8000     .4500     .4500     .0462     .0462   2.0000     .2250     .2250
  1.000 3/      ...       ...       ...       ...       ...     ...       ...       ...

1/ Identical for each β.

2/ Lowest possible value of r23.

3/ Highest possible value of r23.

Figure 3.- Effects of intercorrelation, 3-variable case: r12 = .9; r13 = .7;
N = 20. (Chart of the measures plotted against r23; data in table 3.
U. S. Department of Agriculture, Neg. 70-53(11), Agricultural Marketing
Service.)


Table 3.- Data for case in which r12 = 0.9, r13 = 0.7, and N = 20

                                                                  t-ratio for -
    r23       R²1.23    β12.3     β13.2     r12.3     r13.2    sβ 1/     β12.3     β13.2

  0.3187 2/   1.0000    0.7535    0.4599    1.0000    1.0000   0          ...       ...
   .4000       .9488     .7381     .4048     .9473     .8511   0.0599   12.3222    6.7579
   .4500       .9191     .7335     .3695     .9172     .7578    .0772    9.5013    4.7863
   .5000       .8933     .7333     .3333     .8892     .6623    .0917    7.9967    3.6347
   .5500       .8703     .7384     .2939     .8635     .5632    .1046    7.0593    2.8098
   .6000       .8500     .7500     .2500     .8402     .4588    .1175    6.3830    2.1277
   .6500       .8329     .7706     .1991     .8200     .3472    .1305    5.9050    1.5257
   .7000       .8196     .8039     .1373     .8039     .2249    .1442    5.5749     .9521
   .7500       .8114     .8571     .0571     .7938     .0867    .1592    5.3838     .3587
   .8000       .8111     .9444    -.0556     .7935    -.0765    .1758    5.3720    -.3163
   .8500       .8252    1.0991    -.2342     .8107    -.2831    .1925    5.7096   -1.2166
   .9000       .8737    1.4211    -.5789     .8673    -.5789    .1977    7.1882   -2.9282
   .9413 3/   1.0000    2.1149   -1.2912    1.0000   -1.0000   0          ...       ...

1/ Identical for each β.

2/ Lowest possible value of r23.

3/ Highest possible value of r23.

Table 4.- Data for case in which r12 = 0.9, r13 = 0.5, and N = 20

                                                                  t-ratio for -
    r23       R²1.23    β12.3     β13.2     r12.3     r13.2    sβ 1/     β12.3     β13.2

  0.0725 2/   1.0000    0.8684    0.4371    1.0000    1.0000   0          ...       ...
   .1000       .9798     .8586     .4141     .9864     .9454   0.0346   24.8150   11.9682
   .2000       .9167     .8333     .3333     .9428     .7492    .0714   11.6709    4.6681
   .3000       .8681     .8242     .2527     .9079     .5532    .0922    8.9393    2.7408
   .4000       .8333     .8333     .1667     .8819     .3504    .1082    7.7015    1.5407
   .5000       .8133     .8667     .0667     .8667     .1325    .1208    7.1747     .5522
   .6000       .8125     .9375    -.0625     .8661    -.1147    .1311    7.1510    -.4767
   .7000       .8431    1.0784    -.2549     .8892    -.4176    .1345    8.0178   -1.8952
   .8000       .9444    1.3889    -.6111     .9623    -.8413    .0954   14.5587   -6.4057
   .8275 3/   1.0000    1.5428    -.7766    1.0000   -1.0000   0          ...       ...

1/ Identical for each β.

2/ Lowest possible value of r23.

3/ Highest possible value of r23.


Table 5.- Data for case in which r12 = 0.9, r13 = 0.3, and N = 20

                                                                  t-ratio for -
    r23       R²1.23    β12.3     β13.2     r12.3     r13.2    sβ 1/     β12.3     β13.2

 -0.1458 2/   1.0000    0.9642    0.4406    1.0000    1.0000   0          ...       ...
  -.1000       .9636     .9394     .3939     .9798     .8992   0.0465   20.2022    8.4710
  0            .9000     .9000     .3000     .9435     .6882    .0767   11.7340    3.9113
   .1000       .8545     .8788     .2121     .9166     .4842    .0930    9.4495    2.2806
   .2000       .8250     .8750     .1250     .8987     .2810    .1035    8.4541    1.2077
   .3000       .8110     .8901     .0330     .8901     .0722    .1105    8.0552     .2986
   .4000       .8143     .9286    -.0714     .8921    -.1502    .1141    8.1385    -.6258
   .5000       .8400    1.0000    -.2000     .9079    -.3974    .1120    8.9286   -1.7857
   .6000       .9000    1.1250    -.3750     .9439    -.6883    .0959   11.7310   -3.9103
   .6858 3/   1.0000    1.3107    -.5988    1.0000   -1.0000   0          ...       ...

1/ Identical for each β.

2/ Lowest possible value of r23.

3/ Highest possible value of r23.

Table 6.- Data for case in which r12 = 0.9, r13 = 0.1, and N = 20

                                                                  t-ratio for -
    r23       R²1.23    β12.3     β13.2     r12.3     r13.2    sβ 1/     β12.3     β13.2

 -0.3437 2/   1.0000    1.0595    0.4641    1.0000    1.0000   0          ...       ...
  -.3000       .9604    1.0220     .4066     .9798     .8899   0.0510   20.0392    7.9725
  -.2000       .8917     .9583     .2917     .9437     .6556    .0812   11.8017    3.5924
  -.1000       .8465     .9192     .1919     .9192     .4381    .1010    9.1010    1.9000
  0            .8200     .9000     .1000     .9045     .2294    .1030    8.7379     .9709
   .1000       .8101     .8990     .0101     .8990     .0231    .1122    8.0125     .0900
   .2000       .8167     .9166    -.0833     .9027    -.1873    .1058    8.6635    -.7873
   .3000       .8418     .9560    -.1868     .9166    -.4089    .1010    9.4653   -1.8495
   .4000       .8930    1.0000    -.3023     .9319    -.6432    .0854   11.7096   -3.5398
   .5000       .9733    1.1333    -.4667     .9864    -.9272    .0458   24.7445  -10.1900
   .5237 3/   1.0000    1.1680    -.5116    1.0000   -1.0000   0          ...       ...

1/ Identical for each β.

2/ Lowest possible value of r23.

3/ Highest possible value of r23.

EFFECTS OF INTERCORRELATION IN THE FOUR-VARIABLE CASE

Differences from the Three-Variable Case

The problem of intercorrelation in the four-variable case is much more
complicated because there are now three intercorrelation coefficients instead
of one. We have three independent variables, X2, X3, and X4, and we may have
intercorrelation between X2 and X3, X2 and X4, and X3 and X4.

Following the approach used in the three-variable case, let us suppose
that we have a large number of four-variable regression analyses on different
sets of data. We select a number of these analyses in which the values of
r12, r13, and r14 (the direct or simple correlation coefficients between the
dependent and each independent variable) are about the same. Nevertheless,
we find that the partial correlation and net regression coefficients are
different in each case. These differences are due to the varying degrees of
intercorrelation, represented by combinations of values of r23, r24, and r34.

A systematic exploration of the effects of intercorrelation in the
four-variable case would involve a great deal of labor. One possibility
would be to fix the values of r23 and r24, and to trace the effects of
variations in r34 upon the different regression measures. Except for a change
in notation, such a demonstration would apply equally well to changes in either
of the other intercorrelation coefficients, r23 or r24. Before doing this,
however, we shall illustrate the complications of the four-variable case in
terms of the basic formulas for correlation and regression coefficients.

Basic  Formulas 

It will be convenient at this point to introduce a determinant notation,
which avoids excessive rewriting of the simple correlation coefficients.
This notation can be extended to five or more variables, and also to the
three-variable case previously considered.

In the three-variable case, the three-rowed determinant of correlation
coefficients

\[
\Delta =
\begin{vmatrix}
1 & r_{12} & r_{13} \\
r_{12} & 1 & r_{23} \\
r_{13} & r_{23} & 1
\end{vmatrix}
\tag{1}
\]

can be made to yield nine different two-rowed determinants by deleting one
column and one row of Δ. Suppose we call the two-row determinant obtained
by deleting the first column and the first row Δ11 (= 1 - r23²), that obtained
by deleting the first column and the second row Δ12 (= r12 - r13 r23), and
so on. The complete set of two-row determinants, which we call the Δij's, is
as follows:
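
\[ \Delta_{11} = 1 - r_{23}^2 \tag{1.1} \]

\[ \Delta_{22} = 1 - r_{13}^2 \tag{1.2} \]

\[ \Delta_{33} = 1 - r_{12}^2 \tag{1.3} \]

\[ \Delta_{12} = r_{12} - r_{13}\, r_{23} = \Delta_{21} \tag{1.4} \]

\[ \Delta_{13} = -(r_{13} - r_{12}\, r_{23}) = \Delta_{31} \tag{1.5} \]

\[ \Delta_{23} = (r_{23} - r_{12}\, r_{13}) = \Delta_{32} \tag{1.6} \]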

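All of the formulas for correlation and regression measures given in the
preceding section for the three-variable case can be stated in terms of Δ and
the Δij's, as follows:

\[ R^2_{1.23} = 1 - \frac{\Delta}{\Delta_{11}} \tag{2'} \]

\[ \beta_{12.3} = \frac{\Delta_{12}}{\Delta_{11}} \tag{3'} \]

\[ \beta_{13.2} = -\frac{\Delta_{13}}{\Delta_{11}} \tag{4'} \]

\[ r_{12.3} = \frac{\Delta_{12}}{\sqrt{\Delta_{11} \cdot \Delta_{22}}} \tag{5'} \]

\[ r_{13.2} = -\frac{\Delta_{13}}{\sqrt{\Delta_{11} \cdot \Delta_{33}}} \tag{6'} \]

\[ s_{\beta_{12.3}} = s_{\beta_{13.2}} = \frac{\sqrt{\Delta}}{\Delta_{11}\, \sqrt{N - 3}} \tag{7'} \]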
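
The determinant notation can be checked numerically against the closed-form three-variable formulas. The sketch below is an illustration with hypothetical helper names (`det`, `minor`, `measures_from_minors`); it treats each Δij as the plain two-rowed determinant left after deleting one row and one column, with no sign attached, which is the convention implied by the definitions of Δ12 and Δ13 in the text.

```python
import math


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def minor(m, i, j):
    """Determinant left after deleting row i and column j (1-based, unsigned)."""
    return det([row[:j - 1] + row[j:] for k, row in enumerate(m, start=1) if k != i])


def measures_from_minors(r12, r13, r23, n):
    """Three-variable measures expressed through the determinant and its minors."""
    corr = [[1.0, r12, r13], [r12, 1.0, r23], [r13, r23, 1.0]]
    delta = det(corr)
    d11, d22, d33 = minor(corr, 1, 1), minor(corr, 2, 2), minor(corr, 3, 3)
    d12, d13 = minor(corr, 1, 2), minor(corr, 1, 3)
    return {
        "R2": 1.0 - delta / d11,
        "beta12.3": d12 / d11,
        "beta13.2": -d13 / d11,
        "r12.3": d12 / math.sqrt(d11 * d22),
        "r13.2": -d13 / math.sqrt(d11 * d33),
        "s_beta": math.sqrt(delta) / (d11 * math.sqrt(n - 3)),
    }


# Example: r12 = 0.9, r13 = 0.7, r23 = 0.6, N = 20 (one row of table 3).
print({k: round(v, 4) for k, v in measures_from_minors(0.9, 0.7, 0.6, 20).items()})
```

Cofactor expansion is used only to keep the sketch free of external libraries; for larger correlation matrices a numerical linear-algebra routine would be the more usual choice.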


Once we have fixed the values of r12 and r13, Δ and all but two of the
Δij's are functions only of r23; these two, Δ22 and Δ33, are constants.
Each of the six formulas in this paragraph involves Δ11, which changes with
r23, and another determinant which also changes with r23. This means that we
cannot vary r23 arbitrarily without consistently varying Δ and the Δij's.

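In the four-variable case, the determinant of correlation coefficients is

\[
\Delta =
\begin{vmatrix}
1 & r_{12} & r_{13} & r_{14} \\
r_{12} & 1 & r_{23} & r_{24} \\
r_{13} & r_{23} & 1 & r_{34} \\
r_{14} & r_{24} & r_{34} & 1
\end{vmatrix}
\tag{8}
\]

The determinant of the three intercorrelation coefficients is

\[
\Delta_{11} =
\begin{vmatrix}
1 & r_{23} & r_{24} \\
r_{23} & 1 & r_{34} \\
r_{24} & r_{34} & 1
\end{vmatrix}
= 1 + 2\, r_{23}\, r_{24}\, r_{34} - r_{23}^2 - r_{24}^2 - r_{34}^2 .
\tag{9}
\]

Each of the 15 other possible Δij's is now also a three-rowed determinant.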

The formulas in the preceding paragraph still apply, with an appropriate
change in notation: For example,

\[
\beta_{12.34} = \frac{\Delta_{12}}{\Delta_{11}} =
\frac{
\begin{vmatrix}
r_{12} & r_{23} & r_{24} \\
r_{13} & 1 & r_{34} \\
r_{14} & r_{34} & 1
\end{vmatrix}
}{
\begin{vmatrix}
1 & r_{23} & r_{24} \\
r_{23} & 1 & r_{34} \\
r_{24} & r_{34} & 1
\end{vmatrix}
}
\tag{10}
\]

All of the Δij's for which i ≠ j involve all three intercorrelation
coefficients. Δ22, Δ33, and Δ44 each contain only one of these coefficients.
But this last point is not very helpful, as each formula (2) through (7)
includes either Δ itself or a Δij for which i ≠ j.
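
The ratio of determinants used for the four-variable beta coefficient amounts to Cramer's rule applied to the normal equations in standard units. A minimal sketch follows, assuming that reading; the function names are hypothetical and the three betas are returned in the order β12.34, β13.24, β14.23.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def four_variable_betas(r12, r13, r14, r23, r24, r34):
    """Beta coefficients of X2, X3 and X4 by Cramer's rule."""
    inter = [[1.0, r23, r24], [r23, 1.0, r34], [r24, r34, 1.0]]
    direct = [r12, r13, r14]
    d11 = det(inter)  # determinant of the intercorrelation coefficients
    betas = []
    for col in range(3):
        # Replace one column of the intercorrelation matrix by the direct correlations.
        replaced = [row[:] for row in inter]
        for k in range(3):
            replaced[k][col] = direct[k]
        betas.append(det(replaced) / d11)
    return betas


# All direct coefficients 0.7, r23 = r24 = 0.7, and two trial values of r34.
print([round(b, 4) for b in four_variable_betas(0.7, 0.7, 0.7, 0.7, 0.7, 0.7)])
print([round(b, 4) for b in four_variable_betas(0.7, 0.7, 0.7, 0.7, 0.7, 0.3)])
```

With r23 = r24 = 0.7, the first call gives three equal betas, and the second (r34 = 0.3) gives a negative β12.34 with β13.24 equal to β14.23.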

We noted in the three-variable case that the values assumed for r12 and
r13 impose certain limits upon the values which might be assumed by r23.
Similarly, the values assumed for r12 and r14 impose restrictions on r24, and
those assumed for r13 and r14 impose restrictions on r34. For example, if
r12 = r13 = r14 = 0.7, each of the three intercorrelation coefficients may range
from -0.02 to 1.0. However, the values of r23 and r24 also set limits to the
permissible values of r34. A consistent set of limits for the six simple r's
can be derived from the fact that Δ22, Δ33, Δ44, and Δ11 must all lie
between 0 and 1.

Discussion  of  Charts  and  Tables 

Figures 5 through 10 and tables 7 through 13 provide some insights into
the effects of intercorrelation in the four-variable case. The first five
cases assume that all three of the direct correlation coefficients r12, r13,
and r14 are equal to 0.7. Two of the intercorrelation coefficients, r23 and
r24, are then set equal to 0.9, 0.7, 0.5, 0.3 and 0.1, respectively. In each
case, the third intercorrelation coefficient, r34, is allowed to vary over its
entire range of possible values given the values of the other five coefficients
and the basic requirements R²1.234 ≤ 1 and |r34| ≤ 1.

The fact that we have set all three of the direct coefficients equal to
one another, and two of the intercorrelation coefficients equal to each other,
produces several symmetries in the results. One is that β13.24 and β14.23 are
equal in each case. Another is that at the point where r34 is equal to r23 and
r24, all three beta coefficients are equal.

Figure 5 reflects a very high degree of intercorrelation. One symptom of
this is the fact that Δ11, the determinant of intercorrelation coefficients,
takes on very small values (0.036 or less) over its entire range. In figure
10, in contrast, the value of Δ11 reaches a peak of 1.0 when all three
intercorrelation coefficients are zero, and exceeds 0.5 over a considerable
range of values of r34. In fact, figure 5 approaches the extreme of
multicollinearity to which Frisch gave so much attention in the early thirties.
The values of the beta coefficients are very unstable and are smaller than
their standard errors in all but a small portion of the range of permissible
values of r34. And the range of permissible values of r34 is limited.
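
The small size of the intercorrelation determinant in the case of figure 5 can be confirmed by tabulating the expansion 1 + 2 r23 r24 r34 - r23² - r24² - r34² over a grid of r34 with r23 = r24 = 0.9. The sketch below is illustrative, with hypothetical function names. It retains only grid points at which the determinant is non-negative; this is a necessary condition on r34 and not the full permissible range, because the requirement that the coefficient of multiple determination not exceed 1 is not checked here.

```python
def delta11(r23, r24, r34):
    """Determinant of the three intercorrelation coefficients."""
    return 1.0 + 2.0 * r23 * r24 * r34 - r23 ** 2 - r24 ** 2 - r34 ** 2


def delta11_profile(r23, r24, steps=2000):
    """(r34, determinant) pairs on a grid over [-1, 1] where the determinant is >= 0."""
    points = []
    for k in range(steps + 1):
        r34 = -1.0 + 2.0 * k / steps
        value = delta11(r23, r24, r34)
        if value >= 0.0:
            points.append((r34, value))
    return points


profile = delta11_profile(0.9, 0.9)
peak_r34, peak_value = max(profile, key=lambda point: point[1])
print(round(profile[0][0], 3), round(peak_r34, 3), round(peak_value, 4))
```

The peak of about 0.036 occurs near r34 = 0.81, consistent with the bound quoted for figure 5.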

Figure 6, in which r23 and r24 equal 0.7, shows a greater stability of the
beta coefficients with respect to given changes in r34 than does figure 5. The
standard error of the beta coefficients is also more stable than in the
preceding chart. The beta coefficients exceed their standard errors over a
considerably wider range of values, although they do not reach twice the level
of their standard errors anywhere in the permissible range. In both of these
figures the behavior of β12.34 corresponds to that of the weaker coefficient
in some of the three-variable charts. When the level of r34 drops significantly
below the levels of r23 and r24, β12.34 changes sign from positive to negative.
Visually, it appears that β13.24 (= β14.23) is a reflection of β12.34 about the
particular level at which all three intercorrelation coefficients, and hence all
three beta coefficients, are equal. The value of the betas at this point of
equality increases from one case to the next as the paired intercorrelation
coefficients (r23 and r24) decrease.

Figure 7 shows still greater stability in the values of the beta
coefficients and their standard errors. The beta coefficients exceed their
standard errors over a large part of the permissible range and for certain
limited values of $r_{34}$, near zero, they exceed two standard errors.

The data given in table 10 show increasing stability of the beta
coefficients and their standard errors within the range of permissible values.
However, that range itself is somewhat reduced. Apparently, as $r_{23}$ and
$r_{24}$ are lowered, $r_{34}$ must remain significantly above zero if the other
constraints on the various correlation measures are to be met. If all three
intercorrelation coefficients were zero the coefficient of multiple
determination, $R^2_{1.234}$, would be equal to the sum of squares of the three
direct correlation coefficients; in this case, $3 \times (0.7)^2 = 1.47$. As
this is an impossible value of $R^2_{1.234}$, the value of $r_{34}$ which leads
to it is not permissible. While the ratios of the beta coefficients to their
standard errors are greater than 1 over most of the range of permissible values,
at no point does any one of these ratios exceed 2.0.
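
The constraint $R^2_{1.234} \le 1$ can be explored with a short calculation. The sketch below is illustrative only: it computes $R^2_{1.234}$ as the sum of each beta times its direct correlation, and then searches by bisection for the value of $r_{34}$ at which $R^2_{1.234} = 1$. The names `r_squared` and `lowest_r34`, and the search bracket, are assumptions made for the example.

```python
# Sketch only: names and the bisection bracket are assumptions.

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def r_squared(r1, r23, r24, r34):
    """R^2 for one dependent and three independent variables,
    with a common direct correlation r1 = r12 = r13 = r14."""
    rxx = [[1.0, r23, r24],
           [r23, 1.0, r34],
           [r24, r34, 1.0]]
    d11 = det3(rxx)
    total = 0.0
    for col in range(3):
        m = [row[:] for row in rxx]
        for row in range(3):
            m[row][col] = r1
        total += r1 * det3(m) / d11   # beta_j times r_1j
    return total


def lowest_r34(r1, r23, r24, lo, hi, tol=1e-10):
    """Bisection for R^2 = 1, assuming R^2 > 1 at lo and R^2 < 1 at hi."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if r_squared(r1, r23, r24, mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return hi


print(round(r_squared(0.7, 0.0, 0.0, 0.0), 4))         # all intercorrelations zero
print(round(lowest_r34(0.7, 0.5, 0.5, -0.4, 0.9), 4))  # case r23 = r24 = 0.5
```

The first figure is the impossible value 1.47 noted above. The second, about -0.0196, agrees with the lowest value of $r_{34}$ listed in table 9.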

Figure 8 shows a still greater contraction of the range of permissible
values of $r_{34}$. The values of the beta coefficients are quite stable within
this limited range, but the standard errors of these coefficients are changing
rapidly within it. The t-ratio, $\beta / S_\beta$, for $\beta_{12.34}$ exceeds
2.0 toward the lower end of the permissible range of $r_{34}$; t-ratios for the
other beta coefficients do not exceed 1.3 at any value of $r_{34}$.

It  is  evident  from  the  above  results  that  to  have  each  of  the  3  direct 
correlation  coefficients  equal  to  0.7  already  constitutes  a  high  degree  of 
intercorrelation  if  one  hopes  to  achieve  significant  regression  coefficients 
in  a  four-variable  equation  involving  only  20  or  so  observations. 

In figure 9, the direct correlation coefficients are reduced to 0.5 and
two of the intercorrelation coefficients are also set equal to 0.5. This chart
may be compared with that of figure 6, in which all 5 of these coefficients were
set equal to 0.7. The beta coefficients and their standard errors in figure 9
are considerably more stable and cover a wider range of permissible values than
in figure 6. The coefficients exceed their standard errors over most of the
permissible range, and the t-ratios for $\beta_{13.24}$ and $\beta_{14.23}$
exceed 2.0 over a sizable range, reaching a maximum of 2.83 when $r_{34}$
reaches its lowest value.

In figure 10 the three direct correlation coefficients are again set equal
to 0.5 and two of the intercorrelation coefficients are set at zero. The range
of permissible values of $r_{34}$ is about the same as in figure 9 but the
degree of stability of the beta coefficients and their standard errors is
considerably greater. The coefficient $\beta_{12.34}$ is independent of
$r_{34}$. The t-ratios are
greater than 1 over almost the full range of permissible values and exceed 2.0
over considerable portions of this range.

Figure 5


Table 7.- Data for case in which r12 = r13 = r14 = 0.7,
r23 = r24 = 0.9, and N = 20

                                                           t-ratio for -
    r34       β12.34    β13.24    β14.23    S_β 1/    β12.34    β13.24    β14.23

 0.6392 2/   -5.8586    3.6436    3.6436    3.0000   -1.9529    1.2145    1.2145
  .6500      -3.5000    2.3333    2.3333    2.2105   -1.5834    1.0556    1.0556
  .6900      -1.1000    1.0000    1.0000    1.3474    -.8164     .7422     .7422
  .6941      -1.0000     .9444     .9444    1.3102    -.7633     .7208     .7208
  .7000       -.8750     .8750     .8750    1.2627    -.6930     .6930     .6930
  .7500       -.2692     .5385     .5385    1.0429    -.2582     .5163     .5163
  .7720       -.1290     .4605     .4605    1.0000    -.1290     .4605     .4605
  .8000            0     .3889     .3889     .9721         0     .4000     .4000
  .8500        .1522     .3044     .3044     .9825     .1549     .3098     .3098
  .8645        .1847     .2862     .2862    1.0000     .1847     .2862     .2862
  .9000        .2500     .2500     .2500    1.0827     .2309     .2309     .2309
  .9500        .3182     .2121     .2121    1.4044     .2266     .1510     .1510
 1.0000 3/       ---       ---       ---       ---       ---       ---       ---

1/ Identical for each β.
2/ Lowest possible value of r34.
3/ Highest possible value of r34.


Figure 6


Table 8.- Data for case in which r12 = r13 = r14 = 0.7,
r23 = r24 = 0.7, and N = 20

                                                           t-ratio for -
    r34       β12.34    β13.24    β14.23    S_β 1/    β12.34    β13.24    β14.23

 0.1529 2/   -1.0000    1.2143    1.2143    0.6532   -1.5310    1.8590    1.8590
  .1900       -.7000    1.0000    1.0000     .5782   -1.2106    1.7294    1.7294
  .2000       -.6364     .9546     .9546     .5625   -1.1314    1.6971    1.6971
  .3000       -.2188     .6562     .6562     .4622    -.4733    1.4199    1.4199
  .4000            0     .5000     .5000     .4167         0    1.2000    1.2000
  .5000        .1346     .4038     .4038     .3982     .3381    1.0142    1.0142
  .5765        .2071     .3521     .3521     .3973     .5214     .8862     .8862
  .6000        .2258     .3387     .3387     .3992     .5657     .8485     .8485
  .7000        .2917     .2917     .2917     .4210     .6928     .6928     .6928
  .8000        .3415     .2561     .2561     .4772     .7155     .5367     .5367
  .9000        .3804     .2283     .2283     .6309     .6030     .3618     .3618
  .9592        .3998     .2145     .2145     .9520     .4199     .2253     .2253
  .9632        .4010     .2136     .2136    1.0000     .4010     .2136     .2136
  .9800        .4060     .2100     .2100    1.3440     .3021     .1562     .1562
 1.0000 3/       ---       ---       ---       ---       ---       ---       ---

1/ Identical for each β.
2/ Lowest possible value of r34.
3/ Highest possible value of r34.


Figure 7


Table 9.- Data for case in which r12 = r13 = r14 = 0.7,
r23 = r24 = 0.5, and N = 20

                                                           t-ratio for -
    r34       β12.34    β13.24    β14.23    S_β 1/    β12.34    β13.24    β14.23

-0.0196 2/   -0.0286    0.7286    0.7286    0.3572   -0.0800    2.0396    2.0396
      0            0     .7000     .7000     .3500         0    2.0000    2.0000
  .1000        .1167     .5833     .5833     .3224     .3618    1.8091    1.8091
  .2000        .2000     .5000     .5000     .3062     .6532    1.6330    1.6330
  .2500        .2333     .4667     .4667     .3012     .7746    1.5492    1.5492
  .3000        .2625     .4375     .4375     .2981     .8806    1.4676    1.4676
  .4000        .3111     .3889     .3889     .2970    1.0474    1.3093    1.3093
  .5000        .3500     .3500     .3500     .3031    1.1547    1.1547    1.1547
  .6000        .3818     .3182     .3182     .3182    1.2000    1.0000    1.0000
  .7000        .4083     .2917     .2917     .3472    1.1762     .8402     .8402
  .8000        .4308     .2692     .2692     .4038    1.0667     .6667     .6667
  .9000        .4500     .2500     .2500     .5449     .8259     .4588     .4588
  .9500        .4586     .2414     .2414     .7538     .6084     .3202     .3202
  .9800        .4635     .2365     .2365    1.1763     .3940     .2010     .2010
 1.0000 3/       ---       ---       ---       ---       ---       ---       ---

1/ Identical for each β.
2/ Lowest possible value of r34.
3/ Highest possible value of r34.


Figure 8.- Effects of intercorrelation. 4-variable case:
r12 = r13 = r14 = .7; r23 = r24 = .1; N = 20


Table 11.- Data for case in which r12 = r13 = r14 = 0.7,
r23 = r24 = 0.1, and N = 20

                                                           t-ratio for -
    r34       β12.34    β13.24    β14.23    S_β 1/    β12.34    β13.24    β14.23

 0.5765 2/    0.6191    0.4048    0.4048    0.3079    2.0105    1.3145    1.3145
  .6000        .6202     .3987     .3987     .3133    1.9799    1.2728    1.2728
  .7000        .6250     .3750     .3750     .3455    1.8091    1.0854    1.0854
  .8000        .6292     .3539     .3539     .4054    1.5522     .8731     .8731
  .9000        .6330     .3351     .3351     .5507    1.1494     .6085     .6085
  .9500        .6347     .3264     .3264     .7641     .8307     .4272     .4272
 1.0000 3/       ---       ---       ---       ---       ---       ---       ---

1/ Identical for each β.
2/ Lowest possible value of r34.
3/ Highest possible value of r34.


Figure 9.- Effects of intercorrelation. 4-variable case:
r12 = r13 = r14 = .5; r23 = r24 = .5; N = 20


Table 12.- Data for case in which r12 = r13 = r14 = 0.5,
r23 = r24 = 0.5, and N = 20

                                                           t-ratio for -
    r34       β12.34    β13.24    β14.23    S_β 1/    β12.34    β13.24    β14.23

-0.3333 2/   -1.0000    1.4999    1.4999    0.5303   -1.8856    2.8285    2.8285
 -.3000       -.7500    1.2500    1.2500     .4586   -1.6353    2.7255    2.7255
 -.2500       -.5000    1.0000    1.0000     .3873   -1.2910    2.5820    2.5820
 -.2000       -.3333     .8333     .8333     .3402    -.9798    2.4495    2.4495
 -.1000       -.1250     .6250     .6250     .2827    -.4422    2.2111    2.2111
      0            0     .5000     .5000     .2500         0    2.0000    2.0000
  .1000        .0833     .4167     .4167     .2303     .3618    1.8091    1.8091
  .2000        .1429     .3571     .3571     .2187     .6532    1.6329    1.6329
  .3000        .1875     .3125     .3125     .2129     .8806    1.4676    1.4676
  .4000        .2222     .2778     .2778     .2122    1.0474    1.3093    1.3093
  .5000        .2500     .2500     .2500     .2165    1.1547    1.1547    1.1547
  .6000        .2727     .2273     .2273     .2273    1.2000    1.0000    1.0000
  .7000        .2917     .2083     .2083     .2480    1.1762     .8401     .8401
  .8000        .3077     .1923     .1923     .2885    1.0667     .6667     .6667
  .9000        .3214     .1786     .1786     .3892     .8259     .4588     .4588
 1.0000 3/       ---       ---       ---       ---       ---       ---       ---

1/ Identical for each β.
2/ Lowest possible value of r34.
3/ Highest possible value of r34.


Figure 10.- Effects of intercorrelation. 4-variable case:
r12 = r13 = r14 = .5; r23 = r24 = 0; N = 20


Table 13.- Data for case in which r12 = r13 = r14 = 0.5,
r23 = r24 = 0, and N = 20

                                                           t-ratio for -
    r34       β12.34    β13.24    β14.23    S_β 1/    β12.34    β13.24    β14.23

-0.3333 2/    0.5000    0.7500    0.7500    0.2652    1.8856    2.8284    2.8284
 -.3000        .5000     .7143     .7143     .2574    1.9429    2.7756    2.7756
 -.2500        .5000     .6667     .6667     .2472    2.0226    2.6968    2.6968
 -.2000        .5000     .6250     .6250     .2387    2.0949    2.6186    2.6186
 -.1000        .5000     .5556     .5556     .2255    2.2172    2.4636    2.4636
      0        .5000     .5000     .5000     .2165    2.3094    2.3094    2.3094
  .1000        .5000     .4546     .4546     .2109    2.3708    2.1553    2.1553
  .2000        .5000     .4167     .4167     .2083    2.4000    2.0000    2.0000
  .3000        .5000     .3846     .3846     .2088    2.3950    1.8423    1.8423
  .4000        .5000     .3571     .3571     .2125    2.3525    1.6803    1.6803
  .5000        .5000     .3333     .3333     .2205    2.2678    1.5118    1.5118
  .6000        .5000     .3125     .3125     .2344    2.1333    1.3333    1.3333
  .7000        .5000     .2941     .2941     .2582    1.9363    1.1390    1.1390
  .8000        .5000     .2778     .2778     .3027    1.6518     .9177     .9177
  .9000        .5000     .2632     .2632     .4109    1.2170     .6405     .6405
 1.0000 3/       ---       ---       ---       ---       ---       ---       ---

1/ Identical for each β.
2/ Lowest possible value of r34.
3/ Highest possible value of r34.


APPENDIX


Note 1. Limits Imposed on Values of $r_{23}$ by Given Values of $r_{12}$ and $r_{13}$

By  definition,  no  simple  or  multiple  correlation  coefficient  can  exceed 
1  in  absolute  value. 

Let us repeat text equation (2), as follows,

\[
R^2_{1.23} = \frac{r_{12}^2 + r_{13}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{23}^2}, \tag{2}
\]

noting that $R^2_{1.23}$ must lie between 1 and 0.

If $R^2_{1.23} = 1$, we have

\[
1 - r_{23}^2 = r_{12}^2 + r_{13}^2 - 2 r_{12} r_{13} r_{23}. \tag{2.1}
\]

Rearranging terms, we have

\[
r_{23}^2 - 2 r_{12} r_{13} r_{23} + (r_{12}^2 + r_{13}^2 - 1) = 0 \tag{2.2}
\]

and, using a standard formula of elementary algebra, we obtain

\[
r_{23} = r_{12} r_{13} \pm \sqrt{(1 - r_{12}^2)(1 - r_{13}^2)}. \tag{2.3}
\]

If $r_{12} = 0.9$ and $r_{13} = 0.7$, as in figure 3, we have

\[
r_{23} = 0.63 \pm \sqrt{(0.19)(0.51)} = 0.63 \pm 0.3113;
\]

hence, $r_{23} = 0.3187$ or $0.9413$. Substituting these values back in equation
(2), we obtain $R^2_{1.23} = 1$ in each case.
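
The limits given by equation (2.3) are easy to verify. The following minimal sketch evaluates both roots and substitutes them back into equation (2); the function names are illustrative and not part of the original text.

```python
import math


def r23_limits(r12, r13):
    """Lower and upper limits of r23 from equation (2.3)."""
    half_width = math.sqrt((1.0 - r12 ** 2) * (1.0 - r13 ** 2))
    return r12 * r13 - half_width, r12 * r13 + half_width


def r_squared_3(r12, r13, r23):
    """Coefficient of multiple determination from equation (2)."""
    return (r12 ** 2 + r13 ** 2 - 2.0 * r12 * r13 * r23) / (1.0 - r23 ** 2)


low, high = r23_limits(0.9, 0.7)
print(round(low, 4), round(high, 4))            # 0.3187 0.9413
print(round(r_squared_3(0.9, 0.7, low), 6),
      round(r_squared_3(0.9, 0.7, high), 6))    # both equal 1 to rounding
```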

Only if $r_{12} = r_{13}$ can $r_{23}$ reach the maximum value of 1. But when
$r_{23} = 1$, $R^2_{1.23} = r_{12}^2$, which is, in general, less than 1. This
is shown in figures 1 and 2.

Thus, if $r_{12} = r_{13} = 0.7$, equation (2.3) gives us

\[
r_{23} = 0.49 \pm 0.51 = 1 \text{ or } -0.02.
\]

Substituting $-0.02$ for $r_{23}$ in equation (2) we obtain

\[
R^2_{1.23} = \frac{0.98 - 0.98(-0.02)}{1 - (-0.02)^2} = \frac{0.9996}{0.9996} = 1.
\]

But if we substitute $r_{23} = 1$ we obtain

\[
R^2_{1.23} = \frac{0.98 - 0.98(1)}{1 - (1)^2} = \frac{0}{0},
\]

an indeterminate value.

This indeterminacy can be resolved by applying L'Hopital's rule, 2/
from which we obtain

\[
R^2_{1.23} = \frac{-0.98}{-2} = 0.49 \; (= r_{12}^2).
\]

2/ See pp. 15-16 of Woods, Frederick S., Advanced Calculus, new ed.,
397 pp., illus., New York, 1934, or other standard calculus texts.


Note 2. Minimum Values of Specified Correlation and Regression Measures

Assume that we are given the values of $r_{12}$ and $r_{13}$ and wish to obtain
the values of $r_{23}$ at which various coefficients reach their minimum values
within the permissible ranges in which (1) $R^2_{1.23}$ is equal to or less than
1 and (2) $r_{23}$ is equal to or less than 1 in absolute value. A necessary
condition for a minimum is that the partial derivative with respect to $r_{23}$
of a measure that is a function of $r_{23}$ be zero.

1. Starting with equation (2),

\[
R^2_{1.23} = \frac{r_{12}^2 + r_{13}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{23}^2}, \tag{2}
\]

we find that $R^2_{1.23}$ reaches a minimum when

\[
r_{23} = \frac{(r_{12}^2 + r_{13}^2) \pm (r_{12}^2 - r_{13}^2)}{2 r_{12} r_{13}}, \tag{2.4}
\]

that is, when $r_{23} = r_{12}/r_{13}$ or $r_{13}/r_{12}$.

When $r_{12} = r_{13}$ (as in figures 1 and 2), $R^2_{1.23}$ reaches its minimum
value when $r_{23} = 1$. When $r_{12} \ne r_{13}$, only one of the two values of
$r_{23}$ given by equation (2.4) will be less than 1, and hence a permissible
value. In figure 3, we have $r_{23} = 0.7/0.9 = 0.78$; in the data shown in
table 4, $r_{23} = 0.56$; and so on.

It is clear that these are minimum rather than maximum values.
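
As a numerical check on equation (2.4), the sketch below scans the permissible range of $r_{23}$ on a fine grid and reports where equation (2) is smallest. It is an illustration under simple assumptions (a uniform grid, unequal $r_{12}$ and $r_{13}$ so that the upper limit is below 1); the function names and the grid size are not from the paper.

```python
import math


def r_squared_3(r12, r13, r23):
    """Coefficient of multiple determination from equation (2)."""
    return (r12 ** 2 + r13 ** 2 - 2.0 * r12 * r13 * r23) / (1.0 - r23 ** 2)


def r23_at_min_r_squared(r12, r13, steps=20000):
    """Grid search for the r23 that minimizes R^2 inside the limits (2.3).

    Assumes r12 != r13, so that both limits lie strictly inside (-1, 1).
    """
    half_width = math.sqrt((1.0 - r12 ** 2) * (1.0 - r13 ** 2))
    low, high = r12 * r13 - half_width, r12 * r13 + half_width
    grid = [low + (high - low) * k / steps for k in range(1, steps)]
    return min(grid, key=lambda r: r_squared_3(r12, r13, r))


best = r23_at_min_r_squared(0.9, 0.7)          # figure 3 case
print(round(best, 3), round(0.7 / 0.9, 3))     # grid minimum vs. r13 / r12
print(round(r_squared_3(0.9, 0.7, best), 4))   # about 0.81, i.e. r12 squared
```

The grid minimum falls at about 0.778, in agreement with $r_{13}/r_{12}$, and the minimum value of $R^2_{1.23}$ is about 0.81, the square of the larger direct correlation.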

2. Using the same approach, the minima for the beta coefficients,
$\beta_{12.3}$ and $\beta_{13.2}$ respectively, are given at the points where

\[
r_{23} = \frac{r_{12} \pm \sqrt{r_{12}^2 - r_{13}^2}}{r_{13}} \tag{3.2}
\]

and

\[
r_{23} = \frac{r_{13} \pm \sqrt{r_{13}^2 - r_{12}^2}}{r_{12}}. \tag{4.2}
\]

If $r_{12} = r_{13}$, the two beta coefficients are identical and reach their
low point when $r_{23} = 1$. If $r_{13} > r_{12}$, the low point for
$\beta_{12.3}$ is imaginary, while that for $\beta_{13.2}$ occurs at the smaller
root of equation (4.2). The converse is true if $r_{12} > r_{13}$. In the data
shown in table 4, for example, $\beta_{13.2}$ has no minimum value in the
permissible range; $\beta_{12.3}$ has a minimum value at the point

\[
r_{23} = \frac{0.9 \pm \sqrt{0.81 - 0.25}}{0.5} = \frac{0.900 \pm 0.748}{0.5} = 0.304 \text{ or } 3.296.
\]

The second value of $r_{23}$ is outside the permissible range.
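
The turning points of $\beta_{12.3}$ can be checked directly. The sketch below evaluates the two roots $(r_{12} \pm \sqrt{r_{12}^2 - r_{13}^2}) / r_{13}$ obtained by setting the derivative of $\beta_{12.3}$ with respect to $r_{23}$ to zero, and returns `None` when the roots are imaginary. The function names are illustrative.

```python
import math


def beta_12_3(r12, r13, r23):
    """Beta coefficient of variable 2, holding variable 3 constant."""
    return (r12 - r13 * r23) / (1.0 - r23 ** 2)


def beta_12_3_turning_points(r12, r13):
    """Roots of d(beta12.3)/d(r23) = 0, or None if they are imaginary."""
    disc = r12 ** 2 - r13 ** 2
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return (r12 - root) / r13, (r12 + root) / r13


points = beta_12_3_turning_points(0.9, 0.5)    # r12 = 0.9, r13 = 0.5
print([round(p, 4) for p in points])
print(beta_12_3_turning_points(0.5, 0.9))      # imaginary case: None
```

Carried to more decimal places, the smaller root is about 0.3033; the value 0.304 quoted above reflects rounding the square root to 0.748. The larger root exceeds 1 and is therefore not a permissible value of $r_{23}$.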

A prominent feature of cases 3 to 6 is the fact that $\beta_{13.2}$ changes
sign at a value of $r_{23}$ somewhat greater than that of $r_{13}$. Referring to
equation (4),

\[
\beta_{13.2} = \frac{r_{13} - r_{12} r_{23}}{1 - r_{23}^2}, \tag{4}
\]

we see that the sign is determined by the numerator only. If
$r_{23} < r_{13}/r_{12}$, $\beta_{13.2} > 0$; if $r_{23} > r_{13}/r_{12}$,
$\beta_{13.2} < 0$.
3. The minima for the partial correlation coefficients, $r_{12.3}$ and
$r_{13.2}$, respectively are given at the points where

\[
r_{23} = \frac{r_{13}}{r_{12}} \tag{5.1}
\]

and

\[
r_{23} = \frac{r_{12}}{r_{13}}. \tag{6.1}
\]

In tables 3 through 6, $r_{13} < r_{12}$ and $r_{13.2}$ has no minimum in the
permissible range of values for $r_{23}$. However, $r_{12.3}$ has a minimum at
the level indicated by equation (5.1).

When $r_{12} = r_{13}$, $r_{12.3}$ and $r_{13.2}$ are equal and reach their
lowest point in the permissible range when $r_{23} = 1$.

4. Finally, the minimum value of the standard errors of the beta
coefficients can be obtained easily from formula (7),

\[
S_{\beta_{12.3}} = S_{\beta_{13.2}} = \sqrt{\frac{1 - R^2_{1.23}}{(1 - r_{23}^2)(N - 3)}}. \tag{7}
\]

When $R^2_{1.23} = 1$, $S_{\beta_{12.3}} = 0$ (provided $r_{23}^2$ is less than
1). Thus, in tables 3 through 6, $S_{\beta_{12.3}} = 0$ at two points, one at
each end of the permissible range of values of $r_{23}$. In figures 1 and 2,
$S_{\beta_{12.3}} = 0$ at one point, the lowest permissible value of $r_{23}$.

Figures 1 and 2 indicate that, if $r_{12} = r_{13}$, $S_{\beta_{12.3}}$
approaches infinity as $r_{23}$ approaches 1. This follows readily from equation
(7), since if $R^2_{1.23}$ is less than 1 at the point where $r_{23} = 1$, we
have a real number divided by zero.

If we set $\partial S_{\beta_{12.3}} / \partial r_{23} = 0$ we obtain

\[
r_{23}^3 - 3 r_{12} r_{13} r_{23}^2 + \left[ 2 (r_{12}^2 + r_{13}^2) - 1 \right] r_{23} - r_{12} r_{13} = 0. \tag{7.3}
\]

For given values of $r_{12}$ and $r_{13}$ this can be solved most readily by
plotting the values of the function for a range of values of $r_{23}$. Thus,
for table 5, the expression (7.3) becomes

\[
r_{23}^3 - 0.81 r_{23}^2 + 0.8 r_{23} - 0.27 = 0.
\]

This equals zero when $r_{23}$ is approximately 0.424. It is clear from the
table that this is a maximum, or upper turning point, rather than a minimum.
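
In place of plotting, the root of expression (7.3) can be located by bisection. The sketch below is illustrative only. It assumes that table 5 uses $r_{12} = 0.9$ and $r_{13} = 0.3$, which is what the coefficients 0.81, 0.8, and 0.27 imply; the function names and the bracket from 0 to 1 are assumptions of the example.

```python
def cubic_7_3(r23, r12, r13):
    """Left-hand side of expression (7.3)."""
    return (r23 ** 3
            - 3.0 * r12 * r13 * r23 ** 2
            + (2.0 * (r12 ** 2 + r13 ** 2) - 1.0) * r23
            - r12 * r13)


def bisect(f, lo, hi, tol=1e-10):
    """Root of f between lo and hi, assuming f changes sign on the interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return (lo + hi) / 2.0


# Assumed table 5 values: r12 = 0.9, r13 = 0.3.
root = bisect(lambda r: cubic_7_3(r, 0.9, 0.3), 0.0, 1.0)
print(round(root, 3))   # approximately 0.424
```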

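Note 3. Relation of Correlation Formulas in Determinant Notations to the
$P_{ij}$ Table or Inverse Correlation Matrix

The elements of the inverse of a matrix of simple correlation coefficients
may, in the three-variable case, be written as follows:

\[
\begin{pmatrix}
\dfrac{\Delta_{11}}{\Delta} & -\dfrac{\Delta_{12}}{\Delta} & \dfrac{\Delta_{13}}{\Delta} \\[2ex]
-\dfrac{\Delta_{12}}{\Delta} & \dfrac{\Delta_{22}}{\Delta} & -\dfrac{\Delta_{23}}{\Delta} \\[2ex]
\dfrac{\Delta_{13}}{\Delta} & -\dfrac{\Delta_{23}}{\Delta} & \dfrac{\Delta_{33}}{\Delta}
\end{pmatrix}
\]

The array of elements in the inverse is often referred to as "the $P_{ij}$
table" in computation methods such as those developed by Waugh 4/. In this
notation, $P_{11} = \Delta_{11}/\Delta$, $P_{12} = -\Delta_{12}/\Delta$, and so on.

4/ Waugh, F. V. A Simplified Method of Determining Multiple Regression
Constants, Amer. Statis. Assoc. Jour. 30:694-700. 1935.

To calculate $\beta_{12.3}$ from the inverse or $P_{ij}$ table, we divide
$-P_{12}$ by $P_{11}$. This is equivalent to equation (3'):

\[
\beta_{12.3} = \frac{-P_{12}}{P_{11}} = \frac{\Delta_{12}/\Delta}{\Delta_{11}/\Delta} = \frac{\Delta_{12}}{\Delta_{11}}. \tag{3'}
\]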


4i         4n 

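All the other formulas in the text can be derived in the same fashion
from the $P_{ij}$ table. For example,

\[
R^2_{1.23} = 1 - \frac{1}{P_{11}} = 1 - \frac{\Delta}{\Delta_{11}}, \tag{2'}
\]

\[
r_{12.3} = \frac{-P_{12}}{\sqrt{P_{11} P_{22}}} = \frac{\Delta_{12}}{\sqrt{\Delta_{11} \Delta_{22}}}, \tag{5'}
\]

and so on.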
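
The correspondence between the inverse matrix and the text formulas can be confirmed numerically. The following is a minimal sketch, not the computing routine of the paper or of Waugh's method: it inverts a three-variable correlation matrix by cofactors and compares the results with the direct formulas. The correlation values 0.9, 0.7, and 0.5 are illustrative, and all function names are assumptions.

```python
import math


def minor(m, i, j):
    """2x2 determinant left after deleting row i and column j of a 3x3 matrix."""
    rows = [r for k, r in enumerate(m) if k != i]
    sub = [[v for c, v in enumerate(r) if c != j] for r in rows]
    return sub[0][0] * sub[1][1] - sub[0][1] * sub[1][0]


def inverse3(m):
    """Inverse of a 3x3 matrix by cofactors: P_ij = (-1)^(i+j) * minor_ji / det."""
    det = sum((-1) ** j * m[0][j] * minor(m, 0, j) for j in range(3))
    return [[(-1) ** (i + j) * minor(m, j, i) / det for j in range(3)]
            for i in range(3)]


r12, r13, r23 = 0.9, 0.7, 0.5          # illustrative values
corr = [[1.0, r12, r13],
        [r12, 1.0, r23],
        [r13, r23, 1.0]]
p = inverse3(corr)

beta_12_3 = -p[0][1] / p[0][0]                        # beta from the P table
r_sq = 1.0 - 1.0 / p[0][0]                            # multiple determination
partial_12_3 = -p[0][1] / math.sqrt(p[0][0] * p[1][1])  # partial correlation

# Direct three-variable formulas for comparison.
beta_direct = (r12 - r13 * r23) / (1.0 - r23 ** 2)
r_sq_direct = (r12 ** 2 + r13 ** 2 - 2.0 * r12 * r13 * r23) / (1.0 - r23 ** 2)
partial_direct = (r12 - r13 * r23) / math.sqrt((1.0 - r13 ** 2) * (1.0 - r23 ** 2))

print(round(beta_12_3, 4), round(beta_direct, 4))
print(round(r_sq, 4), round(r_sq_direct, 4))
print(round(partial_12_3, 4), round(partial_direct, 4))
```

Each pair of printed values agrees, which is the equivalence stated above.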


The text formulas in determinant notation can be generalized for any
number of variables. The same is true of the corresponding formulas in the
$P_{ij}$ notation. Thus,


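\[
\beta_{12.34} = \frac{-P_{12}}{P_{11}} = \frac{\Delta_{12}/\Delta}{\Delta_{11}/\Delta} = \frac{\Delta_{12}}{\Delta_{11}}, \tag{10}
\]

where $P_{11}$ and $P_{12}$ are the first and second elements in the first row of a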


^-rowed  determinant, 


CO 


4* 


,   X  j   J   a  A,   ....   *t. 


